Model Selection of Symbolic Regression to Improve the Accuracy of PM2.5 Concentration Prediction
Abstract
PM2.5, one of the main components of haze, has recently drawn wide public attention in China. In this paper, we predict PM2.5 concentrations in Dalian, China, via symbolic regression (SR) based on genetic programming (GP). The key problem in prediction is how to select accurate models using suitable interestingness measures. In addition to commonly used measures such as the R-squared value, mean squared error, and number of parameters, we study the effectiveness of a set of potentially useful measures: AIC, BIC, HQC, AICc, and EDC. We also propose a new interestingness measure, Interestingness Elasticity (IE). The experimental results show that the new measure performs best at selecting candidate models and shows promising extrapolative capability.
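As a rough illustration of how the information criteria named in the abstract can rank candidate models, the sketch below scores each candidate from its residual sum of squares. It assumes Gaussian errors and a least-squares fit, so the log-likelihood term reduces to n ln(RSS/n) up to a constant. The function names `information_criteria` and `select_model` and the candidate values are hypothetical and not taken from the paper. EDC and the proposed IE measure are left out because the abstract does not give their definitions.

```python
import math


def information_criteria(rss, n, k):
    """Score one fitted model (assumed Gaussian errors, least-squares fit).

    rss: residual sum of squares on the training data
    n:   number of observations
    k:   number of fitted parameters
    """
    if rss <= 0 or n <= k + 1 or n < 3:
        raise ValueError("need rss > 0, n > k + 1, and n >= 3")
    # -2 * log-likelihood up to an additive constant shared by all models
    base = n * math.log(rss / n)
    aic = base + 2 * k
    return {
        "AIC": aic,
        "BIC": base + k * math.log(n),
        "HQC": base + 2 * k * math.log(math.log(n)),
        # small-sample correction added to AIC
        "AICc": aic + 2 * k * (k + 1) / (n - k - 1),
    }


def select_model(candidates, n, criterion="BIC"):
    """Return the name of the candidate with the lowest criterion value.

    candidates: dict mapping a model name to a (rss, k) tuple
    """
    scores = {
        name: information_criteria(rss, n, k)[criterion]
        for name, (rss, k) in candidates.items()
    }
    return min(scores, key=scores.get)


if __name__ == "__main__":
    # Hypothetical candidates: a small expression and a larger one
    # that fits the training data only slightly better.
    candidates = {"simple": (50.0, 2), "complex": (49.0, 8)}
    for crit in ("AIC", "BIC", "HQC", "AICc"):
        print(crit, select_model(candidates, n=100, criterion=crit))
```

Lower values are better for all four criteria. Each one adds a penalty that grows with the number of parameters, so a larger expression is preferred only when it reduces the residual error enough to offset that penalty.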
Similar resources
Spatiotemporal Estimation of PM2.5 Concentration Using Remotely Sensed Data, Machine Learning, and Optimization Algorithms
PM2.5 (particles <2.5 μm in aerodynamic diameter) can be measured at ground stations in urban areas, but the number of these stations and their geographical coverage are limited. Therefore, these data are not adequate for calculating PM2.5 concentrations over a large urban area. This study aims to use Aerosol Optical Depth (AOD) satellite images and meteorological data from 2014 to 2017 ...
Modeling of the Relationships Between Spatio-Temporal Changes of Traffic Volume and Particulate Matter-2.5 Pollutant Concentration Based on Geographically Weighted Regression (GWR) and Inverse Distance Weighting (IDW) Model: A Case Study in Tehran M
Background and Aim: High concentrations of particulate matter-2.5 (PM2.5) have caused the unhealthiest days in Tehran, Iran, in recent years. This study aimed to analyze the spatio-temporal pattern of traffic volume and its relationship with PM2.5 pollutant concentrations in the Tehran metropolis during 2015-2018, using the Geographic Information System (GIS). Materi...
Using a Linear Mixed Effects Model to Predict Ground-Level Particulate Matter Concentration: A Case Study in Tehran
Background and Objective: Over the past decade, critical concentrations of particulate matter (PM) have been considered one of the most important issues in the Tehran megacity. Because air quality monitoring stations are sparsely distributed and because of economic considerations, researchers have proposed remote sensing as a fast and economical way to obtain complete spatial and temporal cov...
Application of ensemble learning techniques to model the atmospheric concentration of SO2
For pollution prediction modeling, the study adopts homogeneous (random forest, bagging, and additive regression) and heterogeneous (voting) ensemble classifiers to predict the atmospheric concentration of sulphur dioxide. For model validation, results were compared against widely known single base classifiers such as support vector machine, multilayer perceptron, linear regression and re...
Real Time Pseudo-Range Correction Predicting by a Hybrid GASVM model in order to Improve RTDGPS Accuracy
A differential base station is sometimes unable to send correction information for minutes at a time because of radio interference or loss of signal. To overcome the degradation caused by the loss of the Differential Global Positioning System (DGPS) Pseudo-Range Correction (PRC), the PRC can be predicted. In this paper, the Support Vector Machine (SVM) and Genetic Algorithms (GAs) will be incorpor...